Skip to main content

API reference for Demo RAG

Get Started

Introduction

You can interact with the API through HTTP requests from any language.

Authentication

None as of now

info

Request for API KEY to use AI Foundation Services RAG APIs.

Endpoints

Session Management

Session endpoints are used to manage user chat sessions.

List Sessions

GET http://<hostname>:<port>/api/v3/session/<user_id>

Returns list of user sessions belongs to the user.

Path Parameters
user_id: string Required
    Unique user identifier

Returns
List of session objects

Example Request

curl -X 'GET' \
'<http://localhost:8032/api/v3/session/test_1'> \
-H 'accept: application/json'

Response

[
{
"sessionId": "session_1",
"sessionTitle": "session_1",
"updatedAt": "2024-04-19T15:59:06.902000"
}
]

Create Session

POST http://<hostname>:<port>/api/v3/session

Creates a session for user to start conversations with LLM.

Request Body
userId : string Required
    Unique user identifier
sessionId: string Required
    Unique session identifier

Returns
UserId/SessionId Message

Example Request

curl -X 'POST' \
'<http://localhost:8032/api/v3/session'> \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"userId": "test_1",
"sessionId": "session_1"
}'

Response

{
"id": "test_1/session_1"
}

Remove Session

DELETE http://<hostname>:<port>/api/v3/session

Removes an existing session created by user.

Request Body
userId : string Required
    Unique user identifier

sessionId: string Required
    Unique session identifier

Returns
Message

Example Request

curl -X 'DELETE' \
'<http://localhost:8032/api/v3/session'> \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"userId": "test_1",
"sessionId": "session_2"
}'

Response

{
"message": "Session successfully deleted"
}

Configuration Management

Get Models

GET http://<hostname>:<port>/api/v3/models

Returns list of available LLMs.

Returns
List of models.

Example Request

curl -X 'GET' \
'<http://localhost:8032/api/v3/models'> \
-H 'accept: application/json'

Response

[
"GPT-3.5",
"GPT-4",
"Llama-2",
"Mixtral",
"CodeLlama-2"
]

Get Configuration

GET http://<hostname>:<port>/api/v3/conf/{user_id}/{session_id}

Returns configuration set by user in a particular session.

Path Parameters
userId : string Required
    Unique user identifier

sessionId: string Required
    Unique session identifier

Returns
Smartchat configuration object

Example Request

curl -X 'GET' \
'<http://localhost:8032/api/v3/conf/test_1/session_1'> \
-H 'accept: application/json'

Response

{
"userId": "test_1",
"sessionId": "session_1",
"modelSettings": {
"promptTemplate": "Answer the following question: ",
"temperature": 0.1,
"chatModel": "Mixtral",
"topP": 0.95,
"topK": 40,
"stream": true,
"maxTokens": 2048,
"similarityThreshold": -99,
"chunkSimilarityTopK": 5
}
}

Update Configuration

POST http://<hostname>:<port>/api/v3/conf

Updates configuration for a session to tune the parameters for LLM and RAG retrieval.

Request Body
userId : string Required
    Unique user identifier

sessionId: string Required
    Unique session identifier

modelSettings: object Required

promptTemplate: string Optional
    System prompt to specify certain instructions to the LLM

temperature: number Optional
    What sampling temperature to use, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic.

chatModel: string Required
    Name of the LLM to be used in the session

topP: number Optional
    An alternative to sampling with temperature, called nucleus sampling, where the model considers the results of the tokens with topP probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.

topK: integer Optional
    Sampling samples tokens with the highest probabilities until the specified number of tokens is reached

stream: boolean Optional
    If set, partial message deltas will be sent, like in ChatGPT. Tokens will be sent as data-only as they become available.

maxTokens: integer Optional
    The maximum number of tokens that can be generated in the chat completion. The total length of input tokens and generated tokens is limited by the model's context length.

similarityThreshold: number Optional
    The similarity threshold below which the retrieved nodes will be dropped and not be used for RAG.

chunkSimilarityTopK: integer Optional
    The number of nodes/chunks to be retrieved from the Vector Store for RAG.

Returns
Message

Example Request

curl -X 'POST' \
'<http://localhost:8032/api/v3/conf'> \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"userId": "test_1",
"sessionId": "session_1",
"modelSettings": {"chatModel": "Llama-2"}
}'

Response

{
"message": "Config saved successfully!"
}

File Management

Upload File

POST http://<hostname>:<port>/api/v3/file/user/{user_id}

Uploads the file from user local system to application environment. The size of the file should be maximum 7 MB.

Path Parameters
user_id: string Required
    Unique user identifier

Request Body
file: object Required
    The File object (not file name) to be uploaded.

Returns
Message

Example Request

curl -X 'POST' \
'<http://localhost:8032/api/v3/file/user/test_1'> \
-H 'accept: application/json' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@Service.pdf;type=application/pdf'

Response

[
"success"
]

Remove File

DELETE http://<hostname>:<port>/api/v3/file/user/{user_id}?{file_name}

Removes a file from physical path on application environment.

Path Parameters
user_id: string Required
    Unique user identifier

file: object Required
    _The file name to be deleted.

Returns
None

Example Request

curl -X 'DELETE' \
'<http://localhost:8032/api/v3/file/user/test_1?file=Service.pdf'> \
-H 'accept: application/json'

Context Management

Get Session Context

GET http://<hostname>:<port>/api/v3/context/current/{user_id}/{session_id}

Returns the files/contexts added to a particular user session.

Path Parameters
user_id : string Required
    Unique user identifier

session_id: string Required
    Unique session identifier

Returns
Context object.

Example Request

curl -X 'GET' \
'<http://localhost:8032/api/v3/context/current/test_1/session_1'> \
-H 'accept: application/json'

Response

[
{
"name": "bedienungsanleitung-speedport-smart-3.pdf",
"hash": "a04a15693997a4fee349e90fc2e7b0c8",
"path": "uploads/test_1/doc/bedienungsanleitung-speedport-smart-3.pdf",
"user_id": "test_1",
"created_at": "2024-04-19T15:50:09.103000",
"updated_at": "2024-04-19T15:50:09.103000",
"origin_name": "bedienungsanleitung-speedport-smart-3.pdf",
"session_id": "session_1",
"id": "662feadf94ec3e93b7111a04",
"status": null
}
]

Add Session Context

POST http://<hostname>:<port>/api/v3/context/file

Adds a file/context to a particular user session.

Request Body
userId : string Required
    Unique user identifier

sessionId: string Required
    Unique session identifier

files: list Required
name: string Required
    Name of the file

metadata: object Optional
    Metadata like author, category, etc.

Returns
Docstore insert Id

Example Request

curl -X 'POST' \
'<http://localhost:8032/api/v3/context/file'> \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"userId": "test_1",
"sessionId": "session_1",
"files": [
{
"name": "otc.pdf",
"metadata": {
"Author": "Open Telekom Cloud"
}
}
]
}'

Response

{
"id": [
"662fed349a0d476944a8f1ef"
]
}

Remove Session Context

DELETE http://<hostname>:<port>/api/v3/context/file

Removes a file/context from a particular user session.

Request Body
userId : string Required
    Unique user identifier

sessionId: string Required
    Unique session identifier

removeSource: boolean Required
    Set to true if physical file is also required to be removed.

files: list Required
    List of file names to be removed from context.

Returns
Message

Example Request

curl -X 'DELETE' \
'<http://localhost:8032/api/v3/context/file'> \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"userId": "test_1",
"sessionId": "session_1",
"removeSource": false,
"files": [
"otc.pdf"
]
}'

Response

{
"message": "User context successfully deleted! Warning: None"
}

Chat

AskChat

POST http://<hostname>:<port>/api/v3/askChat

Request Body
userId : string Required
    Unique user identifier

sessionId: string Required
    Unique session identifier

prompt: string Required
    User question

Returns
LLM response object with RAG sources and nodes content 

Example Request

curl -X 'POST' \
'<http://localhost:8032/api/v3/askChat'> \
-H 'accept: application/json' \
-H 'Content-Type: application/json' \
-d '{
"userId": "test_1",
"sessionId": "session_1",
"prompt": "What is OTC?"
}'

Response

{
"userId": "test_1",
"prompt": "What is OTC?",
"response": "OTC stands for Open Telekom Cloud, which is a cloud computing service provided by Deutsche Telekom AG. It offers a range of computing, storage, and network services to businesses and individuals.",
"sources": [
{
"pageLabel": "13",
"fileName": "otc.pdf",
"metadata": {
"Author": "Open Telekom Cloud",
"hash": "416721e9718239c08a51bad61a2f254c"
},
"similarityScore": -0.7444106340408325,
"text": "Open Telekom Cloud  Overview of Billing and Price Models13\nThere are different scenarios for direct connections to the Open Telekom Cloud (OTC). With Ethernet Connect one-\nway or two-way (active/ passive) can be connected. MPLS always has a two-way connection in active/passive mode.  \nWe always recommend a redundant connection. Currently, only a two-path active/passive connection is supported.\nBilling unit\none-way two-way\nactive/passive\n1 X\n1 X\n1 X1 X\n2 X2 XQuantity to calculate\nPLAS Bandwidth\nDirect Connect Bandwidth/Port\nSetup  \n(Direct Connect)One-off fee\nFig.14: Calculation volume for PLAS\nDirect Connect and Private Access Link example\n50 MBIT\n100 Mbit\n1 GBitCosts Direct \nConnectCosts Private  \nAccess Linktotal costs per month\n with 2 x DC +1 x PLASBandwith\n68.55 €\n108.56 €77.53 €218.73 €\n483.99 €232.69 €287.28 €\n592.55 €310.22 €\nDomain Name Service \nWith the Domain Name Service, two components are included in the billing: First, the amount of domains created, \nand second, the number of queries/requests. The domains created are billed by the hour, the queries are billed \nper million. With intensive use of the service (more than a billion inquiries per month), users receive a discount \nvia scales.\nOher Network Services\nOther billing models are used for the remaining Open Telekom Cloud network services. Elastic IP, Elastic Load \nBalancer, NAT Gateway Service and VPC Endpoint are charged „pay as you go" according to usage hours.\nSecurity  \nMany Security Services of the Open Telekom Cloud are free, including identity and account management, the  \nanti-DDos-Service, and EVS and OBS encryption.\nKey Management Service\nThe use of Key Management Service (KMS) incurs usage-based charges. The usage of the generated keys is \ncalculated on an hourly basis. The free contingent for KMS comprises 20,000 API calls. In addition, API calls incur \ncharges: 1,000 calls generate costs of 0.3 cents.Already included\n -Cloud Eye\n -Anti-DDoS\n -Identity and \nAccess  \nManagement"
},

{
"pageLabel": "11",
"fileName": "otc.pdf",
"metadata": {
"Author": "Open Telekom Cloud",
"hash": "416721e9718239c08a51bad61a2f254c"
},
"similarityScore": -0.7214524745941162,
"text": "Open Telekom Cloud  Overview of Billing and Price Models11\n                                                                                    \nData Access Speed\nStorage Price Price Request\nData Access Speed\nPrice Request Storage Price\nData Access Speed\nPrice Request Storage Price\n                                                                                                                                                                                          \nFig.10: Speed and Price comparison for Standard, Warm, and Cold Object Storage \nNetwork \nCloud computing is defined as a sourcing model that provides computing and storage capacities from pools via \nnetworks. Accordingly, the network is also an integral component of cloud computing. VPCs separate the resour -\nces of different tenants from each other, VPN and Elastic IP enable secure access via the Internet, etc. Data trans -\nfer to and within the Open Telekom Cloud is free of charge; downloading and sending out data are priced on a \nstepped scale based on data volumes. The entire outbound data volume for a calendar month is added up and \nused as a basis for billing. At Outbound, we differentiate the data volume into to OBS outbound and VPC out -\nbound. We have summarized the free and paid data transfer in the table below.\n--\n-- Inbound, free of charge\nInbound, free of charge Inbound, free of chargeinternal, free of charge OBS-Outbound, fee-based\nVPC-Outbound, fee-based\n--Goal\nOBSOBSSource\nVPCVPC\nInternetInternet\nFig. 11: Data Transfers fee-based and free of charge\nNetwork example\nYou have 100 TB of data stored in the Open Telekom Cloud. In the course of the month, you added another \n15 TB. In the same period, users used the stored data. In total, 20 TB of data were downloaded by users \naccessing the data. Only the downloaded data (Internet traffic outbound upflow) incur charges. This data \ntraffic reaches the fourth billing scale. COLD OBS WARM OBS STANDARD OBS"
},

{
"pageLabel": "1",
"fileName": "otc.pdf",
"metadata": {
"Author": "Open Telekom Cloud",
"hash": "416721e9718239c08a51bad61a2f254c"
},
"similarityScore": -0.720503568649292,
"text": "Open Telekom Cloud  Overview of Billing and Price Models1\nOpen Telekom Cloud  \nOverview of  \nBilling and Price Models"
},

{
"pageLabel": "7",
"fileName": "otc.pdf",
"metadata": {
"Author": "Open Telekom Cloud",
"hash": "416721e9718239c08a51bad61a2f254c"
},
"similarityScore": -0.7178670763969421,
"text": "Open Telekom Cloud  Overview of Billing and Price Models7\n                             \nJuly\n€ 07/1 -7/29\n28 days x 24 h + 17 h = 689 h\n    - 744 h\n        - 55 hReserved Contingent\nCost of ElasticRestOperating Time\ns2.large.4\nIn the reserved upfront mode,  a fixed amount is billed at the start of the term. Contrary to the regular reserved \nmodels it is not possible to switch to a larger instance or a new generation.  \nIn the subsequent months, the reserved upfront instances are also shown on the bill as a reserved upfront \npackage, without being charged for. The credit balance „depletes" over the selected term, with the pro rata \namount available each month (hours x days). The only difference from the example above is that the complete \namount for 8,760 hours (365 days x 24 hours) is billed at the end of the month in which the order is placed.\nDedicated Host\nDedicated hosts are also available on the Open Telekom Cloud. Within the technical parameters, several separate \nvirtual machines (VMs) can be set up and modified on these reserved resources. The bill shows only the licenses \nfor the VMs, no costs are incurred for open Linux. The same billing models as for ECS also apply for dedicated \nhosts: elastic, reserved and reserved upfront, with the rates applicable for the hosts.\nImage Management Service  \nThe use of the Image Management Service is free of charge. But when storing private images the used object \nstorage (OBS) is billed. These costs are integrated into the OBS invoice item. There are no traffic costs; the  \nstorage of public images is of course completely free of charge.\nCloud Container Engine\nThe Cloud Container Engine (CCE) is also classified as a computing service on the Open Telekom Cloud. CCE \nis free for up to 50 nodes (not HA). If a larger cluster or an HA cluster is set up, there are hourly costs for using \nthe service. \nStorage\nThe Open Telekom Cloud offers five storage types: Elastic Volume Service (EVS, block storage -- always linked to \nvirtual machines), Object Storage Service (OBS), Cloud Server/Volume Backup Service (CSBS/VBS), Storage \nDisaster Recovery Services (SDRS), and Scalable File Storage (SFS). The pricing models vary between the object \nstorage and the block-storage offers (EVS, VBS). In general, object storage is VM-independent and a more cost-\neffective storage option, whereas block storage enables fast data access thanks to a directly connected virtual \nhard drive.\nThe reference period here is also the relevant calendar month. The average volume of allocated storage is deter -\nmined (in GB) and used as the basis for billing. In the block storage option, prices rise on a linear basis in line \nwith the data volume. For object storage, prices are based on (discounted) stepped scales. The scales are of \nspecific sizes and stack on top of each other. The higher the scale, the lower the costs for the storage volumes \nallocated in that scale. The exception to this rule is that the basic volume of 5 GB for standard object storage is \nfree of charge.Already included\n5 GB Basic \nVolume  \nOBS StandardAlready included\nCloud Container \nEngine (CCE)"
},

{
"pageLabel": "8",
"fileName": "otc.pdf",
"metadata": {
"Author": "Open Telekom Cloud",
"hash": "416721e9718239c08a51bad61a2f254c"
},
"similarityScore": -0.7160099744796753,
"text": "Open Telekom Cloud  Overview of Billing and Price Models8\ntotal \ncost\n5 GB 1 TB 10 TB 50 TB 500 TB 1PB 5 PB 10 PB\n(not true to scale)EVS, \nVBS, SFS OBS\nFig.6: Price pattern of EVS and OBS in comarison with increasing use \nCALCULATION BASIS\nAccess options\nPrice pattern\nCost of RequestOBS Criteria\namount of data\ndeclining scaled pricesaccess from internet\ncost per request\nexemption for 5 GBallocated storage/backup volume\nfixed/GB -  linear  no access from the internet/  \ndirect connection to ECS VM\ninclusive\n(SAS,SSD,SATA, \nSAS boosted, SSD boosted)EVS/VBS\nSpecial Characteristics\nFig.7: Comparison OBS and EVS/VBS\nElastic Volume Storage/Volume Backup Storage/Scalable File Service \nElastic Volume Storage (EVS) can be ordered in three classes. Prices also vary according to access speed. \nMonthly costs are determined on the one hand by the allocated storage volumes in GB, and on the other by the \nduration of their use. I.e., if the storage is only provided for half a month, then only half of the costs are incurred. \nElastic Volume Storage is deemed to be provided even if the associated instance is stopped, as long as the storage  \nhas not been deleted. EVS pricing model is also applied to Scalable File Service and Cloud Server Backup Service.\nOnly the ordered/allocated storage volumes are relevant when calculating the EVS, not the specific data \nvolumes. There is a difference using VBS or SFS: In this cases only the actual stored data is billed.\nThe amount of storage used (OBS) or allocated (EVS) is multiplied by the number of usage hours and divided by \nthe total number of hours for the month. This gives the average storage used in a month.\nThe average storage allocated in a month is multiplied by the basic price per GB. This results in a linear increase \nin costs as the volume of allocated storage increases.Pricing Model \nalso valid for \nSFS and CSBS"
}
]
}
© Deutsche Telekom AG